[fix][broker] Use compatible Avro name validator to allow '$' in schema record names#25193
mattisonchao merged 8 commits into master
Pull request overview
This PR addresses a regression introduced by the Avro upgrade (1.12.0) where Protobuf-derived Avro schemas can fail parsing due to $ appearing in generated record names, by introducing a more permissive Avro NameValidator and adding tests to cover the scenario.
Changes:
- Add a custom Avro NameValidator (CompatibleNameValidator) to allow $ in Avro record/field names during schema validation.
- Add unit tests for the validator's behavior and a reproduction test using a generated Protobuf schema.
- Introduce a new Protobuf message (DataRecord.proto) used by the tests.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/validator/StructSchemaDataValidator.java | Uses a custom Avro NameValidator to accept $ during schema parsing/validation. |
| pulsar-broker/src/test/java/org/apache/pulsar/broker/service/schema/validator/SchemaDataValidatorTest.java | Adds tests for the new validator and a Protobuf-based reproduction. |
| pulsar-broker/src/main/proto/DataRecord.proto | Adds a Protobuf schema used to generate Avro with nested-type $ names for testing. |
…PATIBLE_NAME_VALIDATOR
…ed in tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## master #25193 +/- ##
============================================
+ Coverage 72.09% 72.63% +0.53%
- Complexity 2303 34426 +32123
============================================
Files 1956 1959 +3
Lines 154938 155418 +480
Branches 17670 17731 +61
============================================
+ Hits 111706 112882 +1176
+ Misses 34233 33546 -687
+ Partials 8999 8990 -9
Schema.Parser parser =
        new Schema.Parser(StructSchemaDataValidator.COMPATIBLE_NAME_VALIDATOR);
Please also check JsonSchemaCompatibilityCheck.java, which still uses Schema.Parser fromParser = new Schema.Parser(); in its isAvroSchema check:
    private boolean isAvroSchema(SchemaData schemaData) {
        try {
            Schema.Parser fromParser = new Schema.Parser();
            fromParser.setValidateDefaults(false);
            Schema fromSchema = fromParser.parse(new String(schemaData.getData(), UTF_8));
            return true;
        } catch (Exception e) {
            return false;
        }
    }
Good catch! I'll fix it too. :))
…bilityCheck Apply the same COMPATIBLE_NAME_VALIDATOR to JsonSchemaCompatibilityCheck's isAvroSchema() method to allow '$' in schema record names, matching the fix already applied to AvroSchemaBasedCompatibilityCheck, SchemaRegistryServiceImpl, and StructSchemaDataValidator in #25193. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Motivation
After #24617 upgraded Avro to a newer version, the default Schema.Parser() uses UTF_VALIDATOR, which rejects the $ character in record names. This breaks Protobuf schemas whose Avro representation contains $ in generated nested type names (e.g. inner classes).
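As a rough illustration of where the $ comes from: Java joins outer and nested class names with $ in binary names, and nested types are the case the motivation above points to. The sketch below uses hypothetical stand-in classes (NestedNameDemo, DataRecord, Inner), not the generated code from this PR, and it does not show how the Avro representation of a Protobuf schema is actually derived.

```java
public class NestedNameDemo {
    // Hypothetical stand-ins for an outer class and its nested types.
    public static class DataRecord {
        public static class Inner { }
    }

    // Returns the JVM binary name; nested classes are joined to their outer class with '$'.
    public static String binaryName(Class<?> type) {
        return type.getName();
    }

    public static void main(String[] args) {
        // Both names contain '$' because the classes are nested.
        System.out.println(binaryName(DataRecord.class));
        System.out.println(binaryName(DataRecord.Inner.class));
    }
}
```

Any name validator that rejects $ will refuse record names built on this naming convention.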
Modifications
- Add CompatibleNameValidator that allows $ in addition to letters, digits, and underscores, matching the previous Avro behavior.
- Apply the validator to the Schema.Parser() instances that handle Protobuf schemas:
  - StructSchemaDataValidator
  - SchemaRegistryServiceImpl
  - AvroSchemaBasedCompatibilityCheck
- Add DataRecord.proto for test reproduction.
- Add tests for CompatibleNameValidator and Protobuf schema compatibility.
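The naming rule described in the first modification can be sketched without any Avro dependency. This is a minimal illustration under stated assumptions, not the PR's CompatibleNameValidator: the class CompatibleNameRule and the method isValidName are hypothetical, and whether the real validator accepts $ as the first character or accepts non-ASCII letters is assumed here rather than confirmed by the PR text.

```java
public class CompatibleNameRule {
    // Assumed rule: the first character is a letter, '_' or '$';
    // every later character may additionally be a digit.
    public static boolean isValidName(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        char first = name.charAt(0);
        if (!(Character.isLetter(first) || first == '_' || first == '$')) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            if (!(Character.isLetterOrDigit(c) || c == '_' || c == '$')) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Print the verdict for a few representative names.
        for (String name : new String[] {"DataRecord", "Outer$Inner", "_tmp", "1abc", "a-b", ""}) {
            System.out.println("'" + name + "' -> " + isValidName(name));
        }
    }
}
```

In the PR itself the rule is exposed as COMPATIBLE_NAME_VALIDATOR and passed to the Schema.Parser constructor, as the review snippet shows; the sketch covers only the character check.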
Verifying this change
- Unit tests for CompatibleNameValidator (valid names, invalid names, edge cases, error messages).
- A test using ProtobufSchema<DataRecord> to reproduce the original issue.
Documentation
doc-not-needed